A Dirichlet process covarion mixture model and its assessments using posterior predictive discrepancy tests.
Abstract
Heterotachy, the variation of the substitution rate at a site over time, is prevalent in nucleotide and amino acid alignments and may mislead probabilistic phylogenetic inference. The covarion model is a special case of heterotachy in which sites switch between an "ON" state (allowing substitutions according to any particular model of sequence evolution) and an "OFF" state (prohibiting substitutions). In current implementations, the switch rates between the ON and OFF states are homogeneous across sites, a hypothesis that has never been tested. In this study, we developed an infinite mixture model, called the covarion mixture (CM) model, which allows the covarion parameters to vary across sites under the control of a Dirichlet process prior. We further combine the CM model with other approaches: a second, independent Dirichlet process, known as the CAT model, models the heterogeneity of amino acid equilibrium frequencies across sites, and general rate-across-site heterogeneity is modeled by a gamma distribution. Applying the CM model to several large alignments demonstrates that the covarion parameters are significantly heterogeneous across sites. We describe posterior predictive discrepancy tests and use them to demonstrate the importance of these different elements of the models.
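As a rough illustration of the two ideas in the abstract, the sketch below builds a covarion rate matrix over the doubled state space (OFF states first, then ON states) and assigns sites to covarion-parameter categories with a Chinese restaurant process, one common way to sample from a Dirichlet process prior. It is not the authors' implementation: the function names, the Jukes-Cantor substitution process, the exponential base distribution for the switch rates, and the concentration value `alpha=1.0` are all assumptions chosen for brevity.

```python
import numpy as np

def covarion_rate_matrix(R, s01, s10):
    """Covarion generator on 2n states: indices 0..n-1 are OFF, n..2n-1 are ON.

    R   : n x n substitution rate matrix (rows sum to zero), active only when ON.
    s01 : OFF -> ON switch rate (hypothetical parameter name).
    s10 : ON -> OFF switch rate (hypothetical parameter name).
    """
    R = np.asarray(R, dtype=float)
    n = R.shape[0]
    I = np.eye(n)
    # OFF rows: no substitutions, only switching to the same character in ON.
    # ON rows: substitutions according to R, plus switching to OFF.
    return np.block([[-s01 * I, s01 * I],
                     [s10 * I, R - s10 * I]])

def jukes_cantor(n=4):
    """Simple stand-in substitution process (assumption, not from the paper)."""
    R = np.full((n, n), 1.0 / (n - 1))
    np.fill_diagonal(R, -1.0)
    return R

def crp_assignments(n_sites, alpha, rng):
    """Assign sites to categories with a Chinese restaurant process."""
    counts, labels = [], []
    for _ in range(n_sites):
        # Existing categories are chosen in proportion to their size,
        # a new category in proportion to the concentration alpha.
        probs = np.array(counts + [alpha], dtype=float)
        probs /= probs.sum()
        k = int(rng.choice(len(probs), p=probs))
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels

rng = np.random.default_rng(0)
labels = crp_assignments(n_sites=20, alpha=1.0, rng=rng)
n_categories = max(labels) + 1

# One (s01, s10) pair per category, drawn from an assumed exponential base distribution.
switch_rates = rng.exponential(scale=1.0, size=(n_categories, 2))

R = jukes_cantor()
site_matrices = [covarion_rate_matrix(R, *switch_rates[k]) for k in labels]
```

Sites that share a category share the same switch rates, so the number of distinct covarion regimes is not fixed in advance. In a full analysis the category assignments and rates would be sampled from the posterior rather than from the prior as done here.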
Similar resources
Posterior Predictive Assessment of Model Fitness via Realized Discrepancies
This paper considers Bayesian counterparts of the classical tests for goodness of fit and their use in judging the fit of a single Bayesian model to the observed data. We focus on posterior predictive assessment, in a framework that also includes conditioning on auxiliary statistics. The Bayesian formulation facilitates the construction and calculation of a meaningful reference distribution not...
Posterior Predictive Assessment of Model
This paper considers the Bayesian counterparts of the classical tests for goodness of fit and their use in judging the fit of a single Bayesian model to the observed data. We focus on posterior predictive assessment, in a framework that also includes conditioning on ancillary statistics. The Bayesian formulation facilitates the construction and calculation of a meaningful reference distribution not...
Generalized Weighted Chinese Restaurant Processes for Species Sampling Mixture Models
The class of species sampling mixture models is introduced as an extension of semiparametric models based on the Dirichlet process to models based on the general class of species sampling priors, or equivalently the class of all exchangeable urn distributions. Using Fubini calculus in conjunction with Pitman (1995, 1996), we derive characterizations of the posterior distribution in terms of a p...
The Dirichlet Labeling Process for Functional Data Analysis
We consider problems involving functional data where we have a collection of functions, each viewed as a process realization, e.g., a random curve or surface. For a particular process realization, we assume that the observation at a given location can be allocated to separate groups via a random allocation process, which we name the Dirichlet labeling process. We investigate properties of this ...
The Dirichlet Labeling Process for Clustering Functional Data
We consider problems involving functional data where we have a collection of functions, each viewed as a process realization, e.g., a random curve or surface. For a particular process realization, we assume that the observation at a given location can be allocated to separate groups via a random allocation process, which we name the Dirichlet labeling process. We investigate properties of this ...
Journal title:
- Molecular Biology and Evolution
Volume 27, Issue 2
Pages: -
Publication date: 2010